Loosely Tree-Based Alignment for Machine Translation
Abstract
We augment a model of translation based on re-ordering nodes in syntactic trees in order to allow alignments not conforming to the original tree structure, while keeping computational complexity polynomial in the sentence length. This is done by adding a new subtree cloning operation to either tree-to-string or tree-to-tree alignment algorithms.
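The effect of a cloning operation can be illustrated on a toy tree. The sketch below is a minimal illustration, not the paper's algorithm: the `Node` class, the helpers `reorderings`, `clone_subtree`, and `delete_subtree`, and the choice to delete the original subtree after cloning are all assumptions made for this example, and no probabilities are modeled. It shows that permuting children at each node cannot place an outside word between two sibling leaves, while copying a subtree to another node can.

```python
from copy import deepcopy
from dataclasses import dataclass, field
from itertools import permutations, product
from typing import List, Set, Tuple


@dataclass
class Node:
    """Hypothetical syntax-tree node; a node without children is a word."""
    label: str
    children: List["Node"] = field(default_factory=list)

    def leaves(self) -> List[str]:
        """Return the words of the tree in left-to-right order."""
        if not self.children:
            return [self.label]
        words: List[str] = []
        for child in self.children:
            words.extend(child.leaves())
        return words


def reorderings(node: Node) -> Set[Tuple[str, ...]]:
    """All word orders reachable by permuting children at every node."""
    if not node.children:
        return {(node.label,)}
    results: Set[Tuple[str, ...]] = set()
    for perm in permutations(node.children):
        for combo in product(*(reorderings(child) for child in perm)):
            results.add(tuple(word for part in combo for word in part))
    return results


def clone_subtree(source: Node, target_parent: Node, position: int) -> None:
    """Insert a deep copy of `source` as a child of `target_parent`."""
    target_parent.children.insert(position, deepcopy(source))


def delete_subtree(parent: Node, child: Node) -> None:
    """Remove `child` (matched by identity) from `parent` (an assumption
    of this sketch: the original is dropped once it has been cloned)."""
    parent.children = [c for c in parent.children if c is not child]


# Toy tree S(a, X(b, c)); the words b and c share the parent X.
x = Node("X", [Node("b"), Node("c")])
root = Node("S", [Node("a"), x])

# Orders reachable by re-ordering alone: b and c always stay adjacent.
reachable = reorderings(root)

# Clone b directly under S, then drop the original b from X.
b = x.children[0]
clone_subtree(b, root, 0)
delete_subtree(x, b)
print(root.leaves())
```

In this toy case the order b a c is not among the pure re-orderings of the original tree, but it is produced once the subtree for b has been cloned under the root.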
Related papers
An Algorithm for Word-Level Alignment of Parallel Dependency Trees
Structural divergence presents a challenge to the use of syntax in statistical machine translation. We address this problem with a new algorithm for alignment of loosely matched non-isomorphic dependency trees. The algorithm selectively relaxes the constraints of the two tree structures while keeping computational complexity polynomial in the length of the sentences. Experimentation with a larg...
A non-contiguous Tree Sequence Alignment-based Model for Statistical Machine Translation
The tree sequence based translation model allows the violation of syntactic boundaries in a rule to capture non-syntactic phrases, where a tree sequence is a contiguous sequence of subtrees. This paper goes further to present a translation model based on non-contiguous tree sequence alignment, where a non-contiguous tree sequence is a sequence of sub-trees and gaps. Compared with the contiguous...
Tree-to-String Alignment Template for Statistical Machine Translation
We present a novel translation model based on tree-to-string alignment template (TAT) which describes the alignment between a source parse tree and a target string. A TAT is capable of generating both terminals and non-terminals and performing reordering at both low and high levels. The model is linguistically syntax-based because TATs are extracted automatically from word-aligned, source side p...
Using Tectogrammatical Alignment in Phrase-Based Machine Translation
In this paper, we describe an experiment whose goal is to improve the quality of machine translation. Phrase-based machine translation, which is the state-of-the-art in the field of statistical machine translation, learns its phrase tables from large parallel corpora, which have to be aligned on the word level. The most common word-alignment tool is GIZA++. It is very universal and language ind...
Unsupervised Sub-tree Alignment for Tree-to-Tree Translation
This article presents a probabilistic sub-tree alignment model and its application to tree-to-tree machine translation. Unlike previous work, we do not resort to surface heuristics or expensive annotated data, but instead derive an unsupervised model to infer the syntactic correspondence between two languages. More importantly, the developed model is syntactically-motivated and does not rely on...